Problem Set 2

INFX 574

Prem Shah

Collaborated with: Pratik Damania, Gaurav Gohil, Aditya Wakade, Manasi Kulkarni

Importing Libraries

In [191]:
import numpy as np
import matplotlib.pyplot as plt
from numpy.linalg import inv
from IPython.display import Image
%matplotlib inline

1. Gradient Ascent

1.1 Implement 1D GA Algorithm

1. Manual Implementation

In [197]:
Image(filename = 'Images/q1.jpg')
Out[197]:

2. Implement 1D GA for f(x) = -x**2

In [82]:
#Function to implement one-dimensional gradient ascent for f(x) = -x**2
def one_d_grad_ascent(lr, x_0, stopping_param_value):
    
    #lr                   -> Learning rate
    #x_0                  -> Starting point
    #stopping_param_value -> Number of places after the decimal point to which the new and old
    #                        x values are compared. For example, if the value is 2, x_0 and x_1
    #                        are compared to two decimal places.
    
    closing_criteria = 0
    n_iter = 0
    X = list()
    Y = list()
    
    while closing_criteria < 5:
        
        #Compute the derivative, the function value, and the updated x
        d_f_x = -2*x_0
        f_x = -1*(x_0**2)
        x_1 = x_0 + lr*d_f_x
        n_iter = n_iter + 1
        
        #Compare the rounded new and old x values according to the stopping parameter value.
        #If they stay the same 5 times in a row the iterations stop; otherwise a bail-out is
        #triggered at 11000 iterations.
        if round(x_1, stopping_param_value) == round(x_0, stopping_param_value):
            closing_criteria = closing_criteria + 1
        elif n_iter == 11000:
            closing_criteria = 5
        else:
            closing_criteria = 0
        
        print("\n")
        print("Iteration number:", n_iter)
        print("Function value (f_x) is:", f_x)
        print("Gradient value (d_f_x) is:", d_f_x)
        print("Input value (x_0) is:", x_0)

        X.append(x_1)
        Y.append(f_x)
        x_0 = x_1
        
    print("Total number of iterations are:", n_iter)
    plt.scatter(X,Y)
    plt.show()
    return(X,Y)

3. Experiment with different values

Learning Rate: 0.05, Round off to two decimals

In [83]:
X,Y = one_d_grad_ascent(0.05,-20,2)

Iteration number: 1
Function value (f_x) is: -400
Gradient value (d_f_x) is: 40
Input value (x_0) is: -20


Iteration number: 2
Function value (f_x) is: -324.0
Gradient value (d_f_x) is: 36.0
Input value (x_0) is: -18.0


Iteration number: 3
Function value (f_x) is: -262.44
Gradient value (d_f_x) is: 32.4
Input value (x_0) is: -16.2


Iteration number: 4
Function value (f_x) is: -212.57639999999995
Gradient value (d_f_x) is: 29.159999999999997
Input value (x_0) is: -14.579999999999998


Iteration number: 5
Function value (f_x) is: -172.18688399999996
Gradient value (d_f_x) is: 26.243999999999996
Input value (x_0) is: -13.121999999999998


Iteration number: 6
Function value (f_x) is: -139.47137603999997
Gradient value (d_f_x) is: 23.6196
Input value (x_0) is: -11.8098


Iteration number: 7
Function value (f_x) is: -112.97181459239998
Gradient value (d_f_x) is: 21.25764
Input value (x_0) is: -10.62882


Iteration number: 8
Function value (f_x) is: -91.50716981984398
Gradient value (d_f_x) is: 19.131876
Input value (x_0) is: -9.565938


Iteration number: 9
Function value (f_x) is: -74.12080755407362
Gradient value (d_f_x) is: 17.218688399999998
Input value (x_0) is: -8.609344199999999


Iteration number: 10
Function value (f_x) is: -60.037854118799636
Gradient value (d_f_x) is: 15.496819559999999
Input value (x_0) is: -7.748409779999999


Iteration number: 11
Function value (f_x) is: -48.6306618362277
Gradient value (d_f_x) is: 13.947137603999998
Input value (x_0) is: -6.973568801999999


Iteration number: 12
Function value (f_x) is: -39.39083608734444
Gradient value (d_f_x) is: 12.552423843599998
Input value (x_0) is: -6.276211921799999


Iteration number: 13
Function value (f_x) is: -31.90657723074899
Gradient value (d_f_x) is: 11.297181459239997
Input value (x_0) is: -5.648590729619999


Iteration number: 14
Function value (f_x) is: -25.844327556906677
Gradient value (d_f_x) is: 10.167463313315997
Input value (x_0) is: -5.083731656657998


Iteration number: 15
Function value (f_x) is: -20.933905321094407
Gradient value (d_f_x) is: 9.150716981984397
Input value (x_0) is: -4.5753584909921985


Iteration number: 16
Function value (f_x) is: -16.956463310086466
Gradient value (d_f_x) is: 8.235645283785956
Input value (x_0) is: -4.117822641892978


Iteration number: 17
Function value (f_x) is: -13.734735281170039
Gradient value (d_f_x) is: 7.412080755407361
Input value (x_0) is: -3.7060403777036806


Iteration number: 18
Function value (f_x) is: -11.12513557774773
Gradient value (d_f_x) is: 6.6708726798666245
Input value (x_0) is: -3.3354363399333122


Iteration number: 19
Function value (f_x) is: -9.011359817975661
Gradient value (d_f_x) is: 6.003785411879962
Input value (x_0) is: -3.001892705939981


Iteration number: 20
Function value (f_x) is: -7.2992014525602835
Gradient value (d_f_x) is: 5.403406870691965
Input value (x_0) is: -2.7017034353459826


Iteration number: 21
Function value (f_x) is: -5.91235317657383
Gradient value (d_f_x) is: 4.863066183622768
Input value (x_0) is: -2.431533091811384


Iteration number: 22
Function value (f_x) is: -4.789006073024802
Gradient value (d_f_x) is: 4.376759565260492
Input value (x_0) is: -2.188379782630246


Iteration number: 23
Function value (f_x) is: -3.8790949191500896
Gradient value (d_f_x) is: 3.9390836087344425
Input value (x_0) is: -1.9695418043672213


Iteration number: 24
Function value (f_x) is: -3.1420668845115727
Gradient value (d_f_x) is: 3.5451752478609984
Input value (x_0) is: -1.7725876239304992


Iteration number: 25
Function value (f_x) is: -2.545074176454374
Gradient value (d_f_x) is: 3.1906577230748985
Input value (x_0) is: -1.5953288615374492


Iteration number: 26
Function value (f_x) is: -2.0615100829280424
Gradient value (d_f_x) is: 2.8715919507674084
Input value (x_0) is: -1.4357959753837042


Iteration number: 27
Function value (f_x) is: -1.6698231671717143
Gradient value (d_f_x) is: 2.5844327556906674
Input value (x_0) is: -1.2922163778453337


Iteration number: 28
Function value (f_x) is: -1.3525567654090884
Gradient value (d_f_x) is: 2.3259894801216006
Input value (x_0) is: -1.1629947400608003


Iteration number: 29
Function value (f_x) is: -1.0955709799813615
Gradient value (d_f_x) is: 2.0933905321094404
Input value (x_0) is: -1.0466952660547202


Iteration number: 30
Function value (f_x) is: -0.8874124937849028
Gradient value (d_f_x) is: 1.8840514788984963
Input value (x_0) is: -0.9420257394492482


Iteration number: 31
Function value (f_x) is: -0.7188041199657713
Gradient value (d_f_x) is: 1.6956463310086467
Input value (x_0) is: -0.8478231655043234


Iteration number: 32
Function value (f_x) is: -0.5822313371722747
Gradient value (d_f_x) is: 1.526081697907782
Input value (x_0) is: -0.763040848953891


Iteration number: 33
Function value (f_x) is: -0.47160738310954253
Gradient value (d_f_x) is: 1.3734735281170039
Input value (x_0) is: -0.6867367640585019


Iteration number: 34
Function value (f_x) is: -0.38200198031872945
Gradient value (d_f_x) is: 1.2361261753053034
Input value (x_0) is: -0.6180630876526517


Iteration number: 35
Function value (f_x) is: -0.3094216040581709
Gradient value (d_f_x) is: 1.1125135577747731
Input value (x_0) is: -0.5562567788873866


Iteration number: 36
Function value (f_x) is: -0.2506314992871184
Gradient value (d_f_x) is: 1.0012622019972959
Input value (x_0) is: -0.5006311009986479


Iteration number: 37
Function value (f_x) is: -0.20301151442256593
Gradient value (d_f_x) is: 0.9011359817975663
Input value (x_0) is: -0.45056799089878313


Iteration number: 38
Function value (f_x) is: -0.1644393266822784
Gradient value (d_f_x) is: 0.8110223836178097
Input value (x_0) is: -0.4055111918089048


Iteration number: 39
Function value (f_x) is: -0.1331958546126455
Gradient value (d_f_x) is: 0.7299201452560287
Input value (x_0) is: -0.36496007262801433


Iteration number: 40
Function value (f_x) is: -0.10788864223624284
Gradient value (d_f_x) is: 0.6569281307304258
Input value (x_0) is: -0.3284640653652129


Iteration number: 41
Function value (f_x) is: -0.08738980021135671
Gradient value (d_f_x) is: 0.5912353176573832
Input value (x_0) is: -0.2956176588286916


Iteration number: 42
Function value (f_x) is: -0.07078573817119893
Gradient value (d_f_x) is: 0.5321117858916449
Input value (x_0) is: -0.26605589294582244


Iteration number: 43
Function value (f_x) is: -0.05733644791867114
Gradient value (d_f_x) is: 0.4789006073024804
Input value (x_0) is: -0.2394503036512402


Iteration number: 44
Function value (f_x) is: -0.04644252281412362
Gradient value (d_f_x) is: 0.43101054657223237
Input value (x_0) is: -0.21550527328611618


Iteration number: 45
Function value (f_x) is: -0.03761844347944013
Gradient value (d_f_x) is: 0.3879094919150091
Input value (x_0) is: -0.19395474595750456


Iteration number: 46
Function value (f_x) is: -0.0304709392183465
Gradient value (d_f_x) is: 0.3491185427235082
Input value (x_0) is: -0.1745592713617541


Iteration number: 47
Function value (f_x) is: -0.024681460766860664
Gradient value (d_f_x) is: 0.31420668845115735
Input value (x_0) is: -0.15710334422557867


Iteration number: 48
Function value (f_x) is: -0.019991983221157136
Gradient value (d_f_x) is: 0.2827860196060416
Input value (x_0) is: -0.1413930098030208


Iteration number: 49
Function value (f_x) is: -0.01619350640913728
Gradient value (d_f_x) is: 0.25450741764543744
Input value (x_0) is: -0.12725370882271872


Iteration number: 50
Function value (f_x) is: -0.013116740191401195
Gradient value (d_f_x) is: 0.22905667588089368
Input value (x_0) is: -0.11452833794044684


Iteration number: 51
Function value (f_x) is: -0.010624559555034968
Gradient value (d_f_x) is: 0.2061510082928043
Input value (x_0) is: -0.10307550414640215


Iteration number: 52
Function value (f_x) is: -0.008605893239578324
Gradient value (d_f_x) is: 0.18553590746352389
Input value (x_0) is: -0.09276795373176194


Iteration number: 53
Function value (f_x) is: -0.006970773524058444
Gradient value (d_f_x) is: 0.1669823167171715
Input value (x_0) is: -0.08349115835858575


Iteration number: 54
Function value (f_x) is: -0.005646326554487339
Gradient value (d_f_x) is: 0.15028408504545435
Input value (x_0) is: -0.07514204252272717


Iteration number: 55
Function value (f_x) is: -0.004573524509134745
Gradient value (d_f_x) is: 0.13525567654090892
Input value (x_0) is: -0.06762783827045446


Iteration number: 56
Function value (f_x) is: -0.003704554852399143
Gradient value (d_f_x) is: 0.12173010888681803
Input value (x_0) is: -0.06086505444340901


Iteration number: 57
Function value (f_x) is: -0.003000689430443306
Gradient value (d_f_x) is: 0.10955709799813622
Input value (x_0) is: -0.05477854899906811


Iteration number: 58
Function value (f_x) is: -0.0024305584386590776
Gradient value (d_f_x) is: 0.09860138819832259
Input value (x_0) is: -0.049300694099161296


Iteration number: 59
Function value (f_x) is: -0.0019687523353138525
Gradient value (d_f_x) is: 0.08874124937849033
Input value (x_0) is: -0.044370624689245165


Iteration number: 60
Function value (f_x) is: -0.0015946893916042207
Gradient value (d_f_x) is: 0.0798671244406413
Input value (x_0) is: -0.03993356222032065


Iteration number: 61
Function value (f_x) is: -0.001291698407199419
Gradient value (d_f_x) is: 0.07188041199657717
Input value (x_0) is: -0.03594020599828859


Iteration number: 62
Function value (f_x) is: -0.0010462757098315294
Gradient value (d_f_x) is: 0.06469237079691946
Input value (x_0) is: -0.03234618539845973


Iteration number: 63
Function value (f_x) is: -0.0008474833249635387
Gradient value (d_f_x) is: 0.05822313371722751
Input value (x_0) is: -0.029111566858613755


Iteration number: 64
Function value (f_x) is: -0.0006864614932204663
Gradient value (d_f_x) is: 0.05240082034550476
Input value (x_0) is: -0.02620041017275238


Iteration number: 65
Function value (f_x) is: -0.0005560338095085777
Gradient value (d_f_x) is: 0.047160738310954284
Input value (x_0) is: -0.023580369155477142


Iteration number: 66
Function value (f_x) is: -0.000450387385701948
Gradient value (d_f_x) is: 0.042444664479858854
Input value (x_0) is: -0.021222332239929427


Iteration number: 67
Function value (f_x) is: -0.0003648137824185778
Gradient value (d_f_x) is: 0.03820019803187297
Input value (x_0) is: -0.019100099015936484


Iteration number: 68
Function value (f_x) is: -0.00029549916375904805
Gradient value (d_f_x) is: 0.03438017822868567
Input value (x_0) is: -0.017190089114342836


Iteration number: 69
Function value (f_x) is: -0.00023935432264482896
Gradient value (d_f_x) is: 0.030942160405817105
Input value (x_0) is: -0.015471080202908553


Iteration number: 70
Function value (f_x) is: -0.00019387700134231142
Gradient value (d_f_x) is: 0.027847944365235393
Input value (x_0) is: -0.013923972182617697


Iteration number: 71
Function value (f_x) is: -0.00015704037108727224
Gradient value (d_f_x) is: 0.025063149928711854
Input value (x_0) is: -0.012531574964355927


Iteration number: 72
Function value (f_x) is: -0.0001272027005806905
Gradient value (d_f_x) is: 0.022556834935840667
Input value (x_0) is: -0.011278417467920333


Iteration number: 73
Function value (f_x) is: -0.0001030341874703593
Gradient value (d_f_x) is: 0.0203011514422566
Input value (x_0) is: -0.0101505757211283


Iteration number: 74
Function value (f_x) is: -8.345769185099102e-05
Gradient value (d_f_x) is: 0.018271036298030938
Input value (x_0) is: -0.009135518149015469
Total number of iterations are: 74

Learning Rate: 0.1, Round off to five decimals

In [82]:
X,Y = one_d_grad_ascent(0.1,49,5)

Iteration number: 1
Function value (f_x) is: -2401
Gradient value (d_f_x) is: -98
Input value (x_0) is: 49


Iteration number: 2
Function value (f_x) is: -1536.6400000000003
Gradient value (d_f_x) is: -78.4
Input value (x_0) is: 39.2


Iteration number: 3
Function value (f_x) is: -983.4496000000001
Gradient value (d_f_x) is: -62.720000000000006
Input value (x_0) is: 31.360000000000003


Iteration number: 4
Function value (f_x) is: -629.4077440000001
Gradient value (d_f_x) is: -50.176
Input value (x_0) is: 25.088


Iteration number: 5
Function value (f_x) is: -402.82095616
Gradient value (d_f_x) is: -40.1408
Input value (x_0) is: 20.0704


Iteration number: 6
Function value (f_x) is: -257.8054119424
Gradient value (d_f_x) is: -32.11264
Input value (x_0) is: 16.05632


Iteration number: 7
Function value (f_x) is: -164.995463643136
Gradient value (d_f_x) is: -25.690112
Input value (x_0) is: 12.845056


Iteration number: 8
Function value (f_x) is: -105.59709673160702
Gradient value (d_f_x) is: -20.5520896
Input value (x_0) is: 10.2760448


Iteration number: 9
Function value (f_x) is: -67.5821419082285
Gradient value (d_f_x) is: -16.44167168
Input value (x_0) is: 8.22083584


Iteration number: 10
Function value (f_x) is: -43.25257082126623
Gradient value (d_f_x) is: -13.153337343999999
Input value (x_0) is: 6.576668671999999


Iteration number: 11
Function value (f_x) is: -27.681645325610386
Gradient value (d_f_x) is: -10.522669875199998
Input value (x_0) is: 5.261334937599999


Iteration number: 12
Function value (f_x) is: -17.71625300839065
Gradient value (d_f_x) is: -8.41813590016
Input value (x_0) is: 4.20906795008


Iteration number: 13
Function value (f_x) is: -11.338401925370018
Gradient value (d_f_x) is: -6.734508720128
Input value (x_0) is: 3.367254360064


Iteration number: 14
Function value (f_x) is: -7.256577232236811
Gradient value (d_f_x) is: -5.3876069761024
Input value (x_0) is: 2.6938034880512


Iteration number: 15
Function value (f_x) is: -4.644209428631558
Gradient value (d_f_x) is: -4.310085580881919
Input value (x_0) is: 2.1550427904409597


Iteration number: 16
Function value (f_x) is: -2.972294034324197
Gradient value (d_f_x) is: -3.4480684647055355
Input value (x_0) is: 1.7240342323527678


Iteration number: 17
Function value (f_x) is: -1.902268181967486
Gradient value (d_f_x) is: -2.7584547717644283
Input value (x_0) is: 1.3792273858822142


Iteration number: 18
Function value (f_x) is: -1.217451636459191
Gradient value (d_f_x) is: -2.2067638174115425
Input value (x_0) is: 1.1033819087057712


Iteration number: 19
Function value (f_x) is: -0.7791690473338823
Gradient value (d_f_x) is: -1.765411053929234
Input value (x_0) is: 0.882705526964617


Iteration number: 20
Function value (f_x) is: -0.49866819029368453
Gradient value (d_f_x) is: -1.4123288431433871
Input value (x_0) is: 0.7061644215716936


Iteration number: 21
Function value (f_x) is: -0.3191476417879582
Gradient value (d_f_x) is: -1.1298630745147098
Input value (x_0) is: 0.5649315372573549


Iteration number: 22
Function value (f_x) is: -0.20425449074429322
Gradient value (d_f_x) is: -0.9038904596117678
Input value (x_0) is: 0.4519452298058839


Iteration number: 23
Function value (f_x) is: -0.13072287407634764
Gradient value (d_f_x) is: -0.7231123676894142
Input value (x_0) is: 0.3615561838447071


Iteration number: 24
Function value (f_x) is: -0.0836626394088625
Gradient value (d_f_x) is: -0.5784898941515314
Input value (x_0) is: 0.2892449470757657


Iteration number: 25
Function value (f_x) is: -0.053544089221671996
Gradient value (d_f_x) is: -0.4627919153212251
Input value (x_0) is: 0.23139595766061255


Iteration number: 26
Function value (f_x) is: -0.03426821710187008
Gradient value (d_f_x) is: -0.3702335322569801
Input value (x_0) is: 0.18511676612849004


Iteration number: 27
Function value (f_x) is: -0.021931658945196855
Gradient value (d_f_x) is: -0.2961868258055841
Input value (x_0) is: 0.14809341290279204


Iteration number: 28
Function value (f_x) is: -0.014036261724925985
Gradient value (d_f_x) is: -0.23694946064446726
Input value (x_0) is: 0.11847473032223363


Iteration number: 29
Function value (f_x) is: -0.008983207503952631
Gradient value (d_f_x) is: -0.18955956851557382
Input value (x_0) is: 0.09477978425778691


Iteration number: 30
Function value (f_x) is: -0.005749252802529685
Gradient value (d_f_x) is: -0.15164765481245907
Input value (x_0) is: 0.07582382740622953


Iteration number: 31
Function value (f_x) is: -0.003679521793618998
Gradient value (d_f_x) is: -0.12131812384996725
Input value (x_0) is: 0.060659061924983625


Iteration number: 32
Function value (f_x) is: -0.0023548939479161586
Gradient value (d_f_x) is: -0.0970544990799738
Input value (x_0) is: 0.0485272495399869


Iteration number: 33
Function value (f_x) is: -0.0015071321266663417
Gradient value (d_f_x) is: -0.07764359926397904
Input value (x_0) is: 0.03882179963198952


Iteration number: 34
Function value (f_x) is: -0.0009645645610664586
Gradient value (d_f_x) is: -0.06211487941118323
Input value (x_0) is: 0.031057439705591616


Iteration number: 35
Function value (f_x) is: -0.0006173213190825335
Gradient value (d_f_x) is: -0.049691903528946584
Input value (x_0) is: 0.024845951764473292


Iteration number: 36
Function value (f_x) is: -0.00039508564421282137
Gradient value (d_f_x) is: -0.039753522823157264
Input value (x_0) is: 0.019876761411578632


Iteration number: 37
Function value (f_x) is: -0.0002528548122962056
Gradient value (d_f_x) is: -0.03180281825852581
Input value (x_0) is: 0.015901409129262904


Iteration number: 38
Function value (f_x) is: -0.00016182707986957162
Gradient value (d_f_x) is: -0.025442254606820647
Input value (x_0) is: 0.012721127303410323


Iteration number: 39
Function value (f_x) is: -0.00010356933111652584
Gradient value (d_f_x) is: -0.020353803685456518
Input value (x_0) is: 0.010176901842728259


Iteration number: 40
Function value (f_x) is: -6.628437191457653e-05
Gradient value (d_f_x) is: -0.016283042948365214
Input value (x_0) is: 0.008141521474182607


Iteration number: 41
Function value (f_x) is: -4.242199802532898e-05
Gradient value (d_f_x) is: -0.013026434358692171
Input value (x_0) is: 0.006513217179346086


Iteration number: 42
Function value (f_x) is: -2.7150078736210544e-05
Gradient value (d_f_x) is: -0.010421147486953736
Input value (x_0) is: 0.005210573743476868


Iteration number: 43
Function value (f_x) is: -1.737605039117475e-05
Gradient value (d_f_x) is: -0.00833691798956299
Input value (x_0) is: 0.004168458994781495


Iteration number: 44
Function value (f_x) is: -1.1120672250351841e-05
Gradient value (d_f_x) is: -0.006669534391650392
Input value (x_0) is: 0.003334767195825196


Iteration number: 45
Function value (f_x) is: -7.117230240225179e-06
Gradient value (d_f_x) is: -0.005335627513320314
Input value (x_0) is: 0.002667813756660157


Iteration number: 46
Function value (f_x) is: -4.555027353744115e-06
Gradient value (d_f_x) is: -0.0042685020106562515
Input value (x_0) is: 0.0021342510053281257


Iteration number: 47
Function value (f_x) is: -2.9152175063962338e-06
Gradient value (d_f_x) is: -0.003414801608525001
Input value (x_0) is: 0.0017074008042625005


Iteration number: 48
Function value (f_x) is: -1.8657392040935896e-06
Gradient value (d_f_x) is: -0.002731841286820001
Input value (x_0) is: 0.0013659206434100005


Iteration number: 49
Function value (f_x) is: -1.1940730906198974e-06
Gradient value (d_f_x) is: -0.002185473029456001
Input value (x_0) is: 0.0010927365147280004


Iteration number: 50
Function value (f_x) is: -7.642067779967344e-07
Gradient value (d_f_x) is: -0.0017483784235648007
Input value (x_0) is: 0.0008741892117824004


Iteration number: 51
Function value (f_x) is: -4.8909233791791e-07
Gradient value (d_f_x) is: -0.0013987027388518405
Input value (x_0) is: 0.0006993513694259203


Iteration number: 52
Function value (f_x) is: -3.1301909626746244e-07
Gradient value (d_f_x) is: -0.0011189621910814725
Input value (x_0) is: 0.0005594810955407363


Iteration number: 53
Function value (f_x) is: -2.0033222161117597e-07
Gradient value (d_f_x) is: -0.000895169752865178
Input value (x_0) is: 0.000447584876432589


Iteration number: 54
Function value (f_x) is: -1.282126218311526e-07
Gradient value (d_f_x) is: -0.0007161358022921423
Input value (x_0) is: 0.0003580679011460712


Iteration number: 55
Function value (f_x) is: -8.205607797193767e-08
Gradient value (d_f_x) is: -0.0005729086418337139
Input value (x_0) is: 0.00028645432091685695


Iteration number: 56
Function value (f_x) is: -5.251588990204011e-08
Gradient value (d_f_x) is: -0.0004583269134669711
Input value (x_0) is: 0.00022916345673348556


Iteration number: 57
Function value (f_x) is: -3.361016953730567e-08
Gradient value (d_f_x) is: -0.0003666615307735769
Input value (x_0) is: 0.00018333076538678845


Iteration number: 58
Function value (f_x) is: -2.1510508503875633e-08
Gradient value (d_f_x) is: -0.00029332922461886154
Input value (x_0) is: 0.00014666461230943077


Iteration number: 59
Function value (f_x) is: -1.3766725442480403e-08
Gradient value (d_f_x) is: -0.00023466337969508923
Input value (x_0) is: 0.00011733168984754461


Iteration number: 60
Function value (f_x) is: -8.810704283187457e-09
Gradient value (d_f_x) is: -0.00018773070375607137
Input value (x_0) is: 9.386535187803568e-05


Iteration number: 61
Function value (f_x) is: -5.638850741239973e-09
Gradient value (d_f_x) is: -0.0001501845630048571
Input value (x_0) is: 7.509228150242855e-05


Iteration number: 62
Function value (f_x) is: -3.6088644743935828e-09
Gradient value (d_f_x) is: -0.00012014765040388568
Input value (x_0) is: 6.007382520194284e-05


Iteration number: 63
Function value (f_x) is: -2.3096732636118932e-09
Gradient value (d_f_x) is: -9.611812032310855e-05
Input value (x_0) is: 4.8059060161554275e-05


Iteration number: 64
Function value (f_x) is: -1.4781908887116119e-09
Gradient value (d_f_x) is: -7.689449625848685e-05
Input value (x_0) is: 3.844724812924342e-05


Iteration number: 65
Function value (f_x) is: -9.460421687754316e-10
Gradient value (d_f_x) is: -6.151559700678947e-05
Input value (x_0) is: 3.075779850339474e-05


Iteration number: 66
Function value (f_x) is: -6.054669880162761e-10
Gradient value (d_f_x) is: -4.921247760543158e-05
Input value (x_0) is: 2.460623880271579e-05


Iteration number: 67
Function value (f_x) is: -3.874988723304168e-10
Gradient value (d_f_x) is: -3.9369982084345266e-05
Input value (x_0) is: 1.9684991042172633e-05


Iteration number: 68
Function value (f_x) is: -2.479992782914667e-10
Gradient value (d_f_x) is: -3.149598566747621e-05
Input value (x_0) is: 1.5747992833738105e-05


Iteration number: 69
Function value (f_x) is: -1.587195381065387e-10
Gradient value (d_f_x) is: -2.5196788533980967e-05
Input value (x_0) is: 1.2598394266990484e-05


Iteration number: 70
Function value (f_x) is: -1.0158050438818475e-10
Gradient value (d_f_x) is: -2.0157430827184773e-05
Input value (x_0) is: 1.0078715413592387e-05


Iteration number: 71
Function value (f_x) is: -6.501152280843824e-11
Gradient value (d_f_x) is: -1.612594466174782e-05
Input value (x_0) is: 8.06297233087391e-06


Iteration number: 72
Function value (f_x) is: -4.160737459740048e-11
Gradient value (d_f_x) is: -1.2900755729398255e-05
Input value (x_0) is: 6.450377864699128e-06


Iteration number: 73
Function value (f_x) is: -2.6628719742336303e-11
Gradient value (d_f_x) is: -1.0320604583518604e-05
Input value (x_0) is: 5.160302291759302e-06


Iteration number: 74
Function value (f_x) is: -1.7042380635095236e-11
Gradient value (d_f_x) is: -8.256483666814883e-06
Input value (x_0) is: 4.128241833407442e-06


Iteration number: 75
Function value (f_x) is: -1.0907123606460949e-11
Gradient value (d_f_x) is: -6.605186933451906e-06
Input value (x_0) is: 3.302593466725953e-06


Iteration number: 76
Function value (f_x) is: -6.980559108135007e-12
Gradient value (d_f_x) is: -5.284149546761525e-06
Input value (x_0) is: 2.6420747733807624e-06


Iteration number: 77
Function value (f_x) is: -4.467557829206404e-12
Gradient value (d_f_x) is: -4.2273196374092195e-06
Input value (x_0) is: 2.1136598187046098e-06


Iteration number: 78
Function value (f_x) is: -2.8592370106920984e-12
Gradient value (d_f_x) is: -3.3818557099273756e-06
Input value (x_0) is: 1.6909278549636878e-06
Total number of iterations are: 78

From the above implementation, we observe that a more stringent stopping criterion increases the number of iterations, but since the function is one-dimensional, the difference is hard to see in the graph.

Also, with a smaller learning rate each update changes x only marginally, so more iterations are needed to converge and satisfy the defined stopping criterion.
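For this particular function the effect of the learning rate has a closed form: each update is x_1 = x_0 + lr*(-2*x_0) = (1 - 2*lr)*x_0, so the iterates shrink geometrically with ratio |1 - 2*lr|. A minimal sketch verifying this (standalone helper, not part of the graded code above):

```python
import numpy as np

def ga_step(x, lr):
    """One gradient-ascent step for f(x) = -x**2 (gradient is -2x)."""
    return x + lr * (-2 * x)

x = -20.0
lr = 0.05
ratio = 1 - 2 * lr                         # expected geometric contraction factor
for _ in range(5):
    x_new = ga_step(x, lr)
    assert np.isclose(x_new / x, ratio)    # each step multiplies x by (1 - 2*lr)
    x = x_new
```

With lr = 0.05 the factor is 0.9, which matches the printed sequence above (-20, -18.0, -16.2, ...); a smaller learning rate pushes the factor toward 1, which is why convergence then takes more iterations.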

1.2 Implement the 2d Version

4. Prove that x'Ax is a quadratic function

In [193]:
Image(filename = 'Images/q4.jpg')
Out[193]:

5. What is the condition that the problem has a single maximum?

In [194]:
Image(filename = 'Images/q5.jpg')
Out[194]:
In [195]:
Image(filename = 'Images/q5_2.jpg')
Out[195]:
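As a quick numerical companion to the condition derived in the images above: f(x) = -x'Ax has a single (global) maximum exactly when the symmetric part of A, (A + A')/2, is positive definite, i.e. all of its eigenvalues are positive, so that f is strictly concave. A sketch checking this for the matrix used below (the check itself is illustrative, not part of the graded answer):

```python
import numpy as np

a = np.array([[1.0, 2.0], [2.0, 8.0]])

# eigenvalues of the symmetric part of a
eigvals = np.linalg.eigvalsh((a + a.T) / 2)
print(eigvals)

# all eigenvalues positive => -x'Ax is strictly concave => unique maximum at x = 0
assert np.all(eigvals > 0)
```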

6. Compute the gradient vector of this function

In [196]:
Image(filename = 'Images/q6.jpg')
Out[196]:
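The gradient computed above, grad f(x) = -(A + A')x for f(x) = -x'Ax, can be verified numerically with central finite differences (a sketch; the matrix and test point are the ones used in the next section, but any choice works):

```python
import numpy as np

A = np.array([[1.0, 2.0], [2.0, 8.0]])
x = np.array([2.0, -3.0])

f = lambda x: -x @ A @ x           # f(x) = -x'Ax
grad_analytic = -(A + A.T) @ x     # the gradient derived above

# central finite differences, one coordinate direction at a time
eps = 1e-6
grad_fd = np.array([
    (f(x + eps * e) - f(x - eps * e)) / (2 * eps)
    for e in np.eye(2)
])

assert np.allclose(grad_fd, grad_analytic, atol=1e-5)
```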

7. Implementation of 2D Gradient Ascent

In [164]:
x_0 = np.matrix('2;-3')
a = np.matrix('1,2;2,8')

def f(x_0):
    #Value of the function f(x) = -x'Ax
    f = -(x_0.T @ a @ x_0) 
    return(f)

def grad(x_0):
    #Gradient of f(x) = -x'Ax, i.e. -(A' + A)x
    dv = -(a.T + a) @ x_0
    return(dv)

def two_d_grad_ascent(x_0, lr, stopping_param_value):
    
    #lr                   -> Learning rate
    #x_0                  -> Starting point
    #stopping_param_value -> Refers to the number of places after the decimal point to which the new and old values
    #                        will be compared. For example, if value is 2, two places after the decimal point for x_0 
    #                        and x_1 will be compared.
    
    closing_criteria = 0
    n_iter = 0
    x_coord = float(x_0[0])
    y_coord = float(x_0[1])
    X = [x_coord]
    Y = [y_coord]
    
    while(closing_criteria<60):
        #Compute the derivative and the new value of function
        func = f(x_0)
        der = grad(x_0)
        
        x_1 = x_0 + lr*der
        
        #Compare the rounded new and old coordinates according to the stopping parameter value.
        #If they stay the same 60 times in a row the iterations stop; otherwise a bail-out is
        #triggered at 11000 iterations.
        if(round(float(x_1[0]), stopping_param_value) == round(float(x_0[0]), stopping_param_value) and
           round(float(x_1[1]), stopping_param_value) == round(float(x_0[1]), stopping_param_value)):
            closing_criteria = closing_criteria + 1
        elif(n_iter == 11000):
            closing_criteria = 60
        else:
            closing_criteria = 0
            
        X.append(float(x_1[0]))
        Y.append(float(x_1[1]))
        x_0 = x_1
        n_iter = n_iter + 1
    
    print('Point of Convergence: ',x_1)
    print('No. of iterations',n_iter)

    return (X,Y)
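The per-coordinate rounding test in the loop above becomes unwieldy as the dimension grows (and is easy to get wrong when copied by hand); an equivalent, dimension-independent formulation compares the whole vectors at once with np.allclose. A sketch of the same stopping rule (the helper name, patience, and bail-out parameters are mine, not part of the assignment):

```python
import numpy as np

a = np.array([[1.0, 2.0], [2.0, 8.0]])
grad = lambda x: -(a + a.T) @ x        # gradient of f(x) = -x'Ax

def grad_ascent(x0, lr, decimals, patience=60, max_iter=11000):
    """Gradient ascent that stops once the rounded iterate is unchanged
    `patience` times in a row (same rule as above, in any dimension)."""
    x, streak = np.asarray(x0, dtype=float), 0
    for n in range(1, max_iter + 1):
        x_new = x + lr * grad(x)
        # vector-wise version of the per-coordinate rounding comparison
        streak = streak + 1 if np.allclose(np.round(x_new, decimals),
                                           np.round(x, decimals)) else 0
        x = x_new
        if streak >= patience:
            break
    return x, n

x_star, n_iter = grad_ascent([2.0, -3.0], 0.1, 2)
```

Because the comparison is a single expression over the whole vector, the same function handles the 2D and 5D problems without duplicating the condition per coordinate.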

Learning Rate: 0.1, Round off to two decimals

In [26]:
X,Y = two_d_grad_ascent(x_0, 0.1,2)
Point of Convergence:  [[ 1.30125709e-05]
 [-3.45567607e-06]]
No. of iterations 124

Learning Rate: 0.01, Round off to two decimals

In [32]:
X,Y = two_d_grad_ascent(x_0, 0.01,2)
Point of Convergence:  [[ 0.00849664]
 [-0.0022564 ]]
No. of iterations 608

Learning Rate: 0.1, Round off to six decimals

In [190]:
X,Y = two_d_grad_ascent(x_0, 0.1,6)
Point of Convergence:  [[ 1.24332787e-09]
 [-3.30183666e-10]]
No. of iterations 218

As we can see from the above results, decreasing the learning rate increases the number of iterations, since each step is smaller and convergence is reached later. A stricter stopping criterion has the same effect.

8. Visualize your results

Learning Rate: 0.1, Round off to two decimals

In [31]:
## create 30x30 grid matrix
n = 30
ex1 = np.linspace(-10, 10, num=n)
ex2 = np.linspace(-10, 10, num=n)
grid1, grid2 = np.meshgrid(ex1, ex2)
## fill the grid via looping.  You may prefer to use ufuncs instead.
z = np.empty_like(grid1)
for i in range(z.shape[0]):
    for j in range(z.shape[1]):
        
        x = np.array([grid1[i,j], grid2[i,j]])
        z[i,j] = f(x)

## Make the plot
plt.figure(figsize=(11,10))

p = plt.contour(grid1, grid2, z,
                levels = [-60, -45,-30, -15, -10, -5, 0],
                colors = 'black')
plt.clabel(p, inline=1, fontsize=10)

plt.scatter(X,Y, c='red', s=100)
plt.plot(X,Y, c='red')
plt.show()

Learning Rate: 0.01, Round off to two decimals

In [33]:
n = 30
ex1 = np.linspace(-10, 10, num=n)
ex2 = np.linspace(-10, 10, num=n)
grid1, grid2 = np.meshgrid(ex1, ex2)
## fill the grid via looping.  You may prefer to use ufuncs instead.
z = np.empty_like(grid1)
for i in range(z.shape[0]):
    for j in range(z.shape[1]):
        
        x = np.array([grid1[i,j], grid2[i,j]])
        z[i,j] = f(x)

## Make the plot
plt.figure(figsize=(11,10))

p = plt.contour(grid1, grid2, z,
                levels = [-60, -45,-30, -15, -10, -5, 0],
                colors = 'black')
plt.clabel(p, inline=1, fontsize=10)

plt.scatter(X,Y, c='red', s=100)
plt.plot(X,Y, c='red')
plt.show()

Here also we see the same result: stricter stopping criteria and smaller learning rates lead to a higher number of iterations.

9. 5D Gradient Ascent

In [161]:
x_0 = np.matrix('10;20;30;40;50')
A = np.matrix('11,4,7,10,13;4,17,10,13,16;7,10,23,16,19;10,13,16,29,22;13,16,19,22,35')

def f(x_0):
    f = -(x_0.T @ A @ x_0) 
    return(f)

def grad(x_0):
    #Gradient of f(x) = -x'Ax, i.e. -(A' + A)x
    dv = -(A.T + A) @ x_0
    return(dv)

def five_d_grad_ascent(x_0, lr, stopping_param_value):
    
    #lr                   -> Learning rate
    #x_0                  -> Starting point
    #stopping_param_value -> Number of places after the decimal point to which the new and old
    #                        values are compared. For example, if the value is 2, x_0 and x_1
    #                        are compared to two decimal places. We also put a default break when
    #                        the function crosses 11000 iterations without converging.
    
    closing_criteria = 0
    n_iter = 0
    x_coord = float(x_0[0])
    y_coord = float(x_0[1])
    X = [x_coord]
    Y = [y_coord]
    
    while(closing_criteria<20):
        func = f(x_0)
        der = grad(x_0)
        
        x_1 = x_0 + lr*der
        
        #We check the difference between new and old x values here according to the stopping parameter value.
        #If they remain the same for 20 times in a row, then the iterations stop else the bail-out number of iterations
        #is set at 11000
        if(round(float(x_1[0]),stopping_param_value) == round(float(x_0[0]),stopping_param_value) and 
           round(float(x_1[1]),stopping_param_value) == round(float(x_0[1]),stopping_param_value) and 
           round(float(x_1[2]),stopping_param_value) == round(float(x_0[2]),stopping_param_value) and 
           round(float(x_1[3]),stopping_param_value) == round(float(x_0[3]),stopping_param_value) and 
           round(float(x_1[4]),stopping_param_value) == round(float(x_0[4]),stopping_param_value)):
            closing_criteria = closing_criteria + 1
        elif(n_iter == 11000):
            closing_criteria = 20

        else:
            closing_criteria = 0
            
        X.append(float(x_1[0]))
        Y.append(float(x_1[1]))
        x_0 = x_1
        n_iter = n_iter + 1

        #print(x_1)
    print("Point of Convergence:",x_0)
    print("No. of iterations:", n_iter)
    #print(Y)
    return (X,Y)
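The coordinate-by-coordinate rounding in the stopping test above does not scale gracefully with dimension. One possible alternative (a sketch, not the assignment's required form) compares whole vectors at once:

```python
import numpy as np

def converged(x_new, x_old, decimals):
    """True when every coordinate of x_new matches x_old
    after rounding to `decimals` places."""
    return np.array_equal(np.round(np.asarray(x_new).ravel(), decimals),
                          np.round(np.asarray(x_old).ravel(), decimals))

# Works for any dimension, e.g. 5-vectors:
x0 = np.array([1.00001, 2.0, 3.0, 4.0, 5.0])
x1 = np.array([1.00002, 2.0, 3.0, 4.0, 5.0])
print(converged(x1, x0, 2))   # True: all coordinates agree to 2 decimals
print(converged(x1, x0, 6))   # False: first coordinates differ at the 5th decimal
```

The same helper then works unchanged in the 1D, 2D, and 5D routines.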
In [35]:
x_0 = np.matrix('10;20;30;40;50')
A = np.matrix('11,4,7,10,13;4,17,10,13,16;7,10,23,16,19;10,13,16,29,22;13,16,19,22,35')
X,Y = five_d_grad_ascent(x_0, 0.01,2)
Point of Convergence: [[-0.00104479]
 [-0.00061318]
 [-0.00018157]
 [ 0.00025003]
 [ 0.00068164]]
No. of iterations: 117
In [36]:
x_0 = np.matrix('10;20;30;40;50')
A = np.matrix('11,4,7,10,13;4,17,10,13,16;7,10,23,16,19;10,13,16,29,22;13,16,19,22,35')
X,Y = five_d_grad_ascent(x_0, 0.01,6)
Point of Convergence: [[-1.05704263e-07]
 [-6.20373700e-08]
 [-1.83704775e-08]
 [ 2.52964150e-08]
 [ 6.89633075e-08]]
No. of iterations: 237
In [163]:
x_0 = np.matrix('10;20;30;40;50')
A = np.matrix('11,4,7,10,13;4,17,10,13,16;7,10,23,16,19;10,13,16,29,22;13,16,19,22,35')
X,Y = five_d_grad_ascent(x_0, 0.006,5)
Point of Convergence: [[-1.93343512e-06]
 [-1.13472463e-06]
 [-3.36014136e-07]
 [ 4.62696358e-07]
 [ 1.26140685e-06]]
No. of iterations: 337
In [40]:
x_0 = np.matrix('10;20;30;40;50')
A = np.matrix('11,4,7,10,13;4,17,10,13,16;7,10,23,16,19;10,13,16,29,22;13,16,19,22,35')
X,Y = five_d_grad_ascent(x_0, 0.1,9)
c:\users\iguest\appdata\local\programs\python\python36\lib\site-packages\ipykernel_launcher.py:32: RuntimeWarning: invalid value encountered in add
Point of Convergence: [[nan]
 [nan]
 [nan]
 [nan]
 [nan]]
No. of iterations: 11001

For a higher learning rate, the 5D gradient ascent fails to converge within the 11000-iteration limit. We see that as the number of dimensions increases, it becomes more difficult to choose a learning rate for which the ascent converges.

In general, the following observations were made:

1. As the learning rate decreases, the number of iterations increases.
2. The number of iterations depends on the distance of the starting vector from the maximum.
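The divergence at the higher learning rate can also be checked analytically for this quadratic. The update is x_1 = (I - lr*(A + A.T)) @ x_0, which contracts toward the origin only when lr < 2 / lambda_max(A + A.T). A sketch verifying this for the matrix used above:

```python
import numpy as np

A = np.array([[11,  4,  7, 10, 13],
              [ 4, 17, 10, 13, 16],
              [ 7, 10, 23, 16, 19],
              [10, 13, 16, 29, 22],
              [13, 16, 19, 22, 35]], dtype=float)

# Ascent on f(x) = -x.T A x iterates x <- (I - lr*(A + A.T)) x, which
# shrinks every eigendirection iff lr < 2 / lambda_max(A + A.T).
lam_max = np.linalg.eigvalsh(A + A.T).max()
threshold = 2.0 / lam_max
print("stability threshold on lr:", threshold)
print("lr = 0.01 stable:", 0.01 < threshold)   # consistent with convergence above
print("lr = 0.1  stable:", 0.1 < threshold)    # consistent with the nan blow-up above
```

This matches the observed behaviour: lr = 0.01 and 0.006 converge, while lr = 0.1 exceeds the threshold and overflows to nan.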

1.3 Condition Numbers

10. Compute the Condition Number for matrix A

In [43]:
a=np.matrix('11 4 7 10 13;4 17 10 13 16;7 10 23 16 19;10 13 16 29 22;13 16 19 22 35')

new=np.linalg.eig(a)
condition_num=(new[0].max())/(new[0].min())
print(condition_num)

print(np.linalg.cond(a))
22.037957076517483
22.037957076517483

The two values agree because A is symmetric positive definite: its singular values equal its eigenvalues, so np.linalg.cond's 2-norm condition number (the ratio of the extreme singular values) matches the ratio of the extreme eigenvalues.
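For a non-symmetric matrix the eigenvalue ratio and `np.linalg.cond` can differ dramatically. A small sketch with a shear matrix (chosen only for illustration):

```python
import numpy as np

# Both eigenvalues of this shear matrix are 1, yet it maps the unit
# circle to a very elongated ellipse, so its 2-norm condition number
# (ratio of extreme singular values) is large.
B = np.array([[1.0, 100.0],
              [0.0,   1.0]])

eig = np.linalg.eigvals(B)
print("eigenvalue ratio:", eig.max() / eig.min())   # 1.0
print("np.linalg.cond  :", np.linalg.cond(B))       # roughly 1e4
```

So the eigenvalue-ratio shortcut is only safe for symmetric positive-definite matrices like the one in this problem.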

11. Show on computer that the matrix is singular

In [71]:
A=np.matrix('1 1;1 1')

#Compute the determinant.
np.linalg.det(A)
Out[71]:
0.0

Here we see that the determinant of the matrix is 0, which is exactly the condition for singularity. Hence the matrix is singular.
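Two further computational checks (a sketch beyond the determinant): a symmetric matrix is singular exactly when 0 is among its eigenvalues, and NumPy refuses to invert it.

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 1.0]])

# A symmetric matrix is singular iff 0 is an eigenvalue.
eigvals = np.linalg.eigvalsh(A)
print("eigenvalues:", eigvals)   # one eigenvalue is 0, the other is 2

# Inversion of a singular matrix raises LinAlgError.
try:
    np.linalg.inv(A)
except np.linalg.LinAlgError as e:
    print("inversion failed:", e)
```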

12. Run algorithm for different cases

In [45]:
def func(alpha,A):
    c = A + alpha*np.eye(2)
    eigenvalue = np.linalg.eig(c)
    condition_number = np.linalg.cond(c)
    return eigenvalue,condition_number

Analyzing the effect of alpha

In [46]:
A = np.matrix('1,1;1,1')
eigenvalue,condition_num = func(0.01,A)
print(eigenvalue)
print(condition_num)
(array([2.01, 0.01]), matrix([[ 0.70710678, -0.70710678],
        [ 0.70710678,  0.70710678]]))
201.00000000000267
In [133]:
eigenvalue,condition_num = func(1,A)
print(eigenvalue)
print(condition_num)
(array([3., 1.]), matrix([[ 0.70710678, -0.70710678],
        [ 0.70710678,  0.70710678]]))
2.999999999999999
In [134]:
eigenvalue,condition_num = func(100,A)
print(eigenvalue)
print(condition_num)
(array([102., 100.]), matrix([[ 0.70710678, -0.70710678],
        [ 0.70710678,  0.70710678]]))
1.0199999999999998

Now we plot the GA for different alpha values

In [51]:
# accept the quadratic eq
A = np.matrix('1,2;2,8')

# accept x y values
x_0 = np.matrix('9;-10')

A = A + (0.01*np.eye(2))
def f(x_0):
    f = -(x_0.T @ A @ x_0) 
    return(f)

def grad(x_0):
    #Gradient of f(x) = -(x.T A x) is -(A.T + A) x
    dv = -(A.T + A) @ x_0
    return(dv)
X,Y = grad_desc_2d(x_0, 0.01, 1)

n = 30
ex1 = np.linspace(-10, 10, num=n)
ex2 = np.linspace(-10, 10, num=n)
grid1, grid2 = np.meshgrid(ex1, ex2)
## fill the grid via looping.  You may prefer to use ufuncs instead.
z = np.empty_like(grid1)
for i in range(z.shape[0]):
    for j in range(z.shape[1]):
        
        x = np.array([grid1[i,j], grid2[i,j]])
        z[i,j] = f(x)

## Make the plot
plt.figure(figsize=(9,9))


p = plt.contour(grid1, grid2, z,
                levels = np.arange(start = -50, stop = 50, step = 5),
                colors = 'black')
plt.clabel(p, inline=1, fontsize=10)

plt.scatter(X,Y, c='red', s=100)
plt.plot(X,Y, c='red')
plt.show()
Point of Convergence:  [[ 0.08358509]
 [-0.02219723]]
No. of iterations 506
In [57]:
# accept the quadratic eq
A = np.matrix('1,2;2,8')

# accept x y values
x_0 = np.matrix('9;-10')

A = A + (1*np.eye(2))
def f(x_0):
    f = -(x_0.T @ A @ x_0) 
    return(f)

def grad(x_0):
    #Gradient of f(x) = -(x.T A x) is -(A.T + A) x
    dv = -(A.T + A) @ x_0
    return(dv)
X,Y = grad_desc_2d(x_0, 0.01, 1)

n = 30
ex1 = np.linspace(-10, 10, num=n)
ex2 = np.linspace(-10, 10, num=n)
grid1, grid2 = np.meshgrid(ex1, ex2)
## fill the grid via looping.  You may prefer to use ufuncs instead.
z = np.empty_like(grid1)
for i in range(z.shape[0]):
    for j in range(z.shape[1]):
        
        x = np.array([grid1[i,j], grid2[i,j]])
        z[i,j] = f(x)

## Make the plot
plt.figure(figsize=(9,9))


p = plt.contour(grid1, grid2, z,
                levels = np.arange(start = -50, stop = 50, step = 5),
                colors = 'black')
plt.clabel(p, inline=1, fontsize=10)

plt.scatter(X,Y, c='red', s=100)
plt.plot(X,Y, c='red')
plt.show()
Point of Convergence:  [[ 0.00824297]
 [-0.00218904]]
No. of iterations 241
In [58]:
# accept the quadratic eq
A = np.matrix('1,2;2,8')

# accept x y values
x_0 = np.matrix('9;-10')

A = A + (100*np.eye(2))
def f(x_0):
    f = -(x_0.T @ A @ x_0) 
    return(f)

def grad(x_0):
    #Gradient of f(x) = -(x.T A x) is -(A.T + A) x
    dv = -(A.T + A) @ x_0
    return(dv)
X,Y = grad_desc_2d(x_0, 0.001, 2)

n = 30
ex1 = np.linspace(-10, 10, num=n)
ex2 = np.linspace(-10, 10, num=n)
grid1, grid2 = np.meshgrid(ex1, ex2)
## fill the grid via looping.  You may prefer to use ufuncs instead.
z = np.empty_like(grid1)
for i in range(z.shape[0]):
    for j in range(z.shape[1]):
        
        x = np.array([grid1[i,j], grid2[i,j]])
        z[i,j] = f(x)

## Make the plot
plt.figure(figsize=(9,9))


p = plt.contour(grid1, grid2, z,
                levels = np.arange(start = -4000, stop = 4000, step = 200),
                colors = 'black')
plt.clabel(p, inline=1, fontsize=10)

plt.scatter(X,Y, c='red', s=100)
plt.plot(X,Y, c='red')
plt.show()
Point of Convergence:  [[ 7.38402153e-09]
 [-2.74029625e-09]]
No. of iterations 94

13. Comments on the results

Hence we have the following observations:

1. As the alpha value increases:
    a. The condition number decreases
    b. The number of iterations decreases

2. Outliers on different distributions:

Normal Distribution

In [103]:
def normalrandomsamples(R, N):
    # Draw R samples of size N each and record the mean of every sample.
    mean_of_samples = list()
    for i in range(0,R):
        sample = np.random.normal(loc = 0.0, scale = 2, size = N)
        mean_of_samples.append(np.mean(sample))
    mean_of_means = np.mean(mean_of_samples)
    std_dev = np.std(mean_of_samples)
    confint = np.percentile(mean_of_samples, 10)
    print('For sample size',N,'&',R,'repetitions:-')
    print('The mean is:',mean_of_means)
    print('The standard deviation is:',std_dev)
    print('The percentile  value is:',confint)
In [104]:
normalrandomsamples(1000,10)
For sample size 10 & 1000 repetitions:-
The mean is: 0.03604486314343281
The standard deviation is: 0.6273455439608099
The percentile  value is: -0.7401556117907153
In [110]:
normalrandomsamples(1000,1000)
For sample size 1000 & 1000 repetitions:-
The mean is: -0.0015981087756172733
The standard deviation is: 0.0636186149746153
The percentile  value is: -0.07753834627831047
In [111]:
normalrandomsamples(1000,100000)
For sample size 100000 & 1000 repetitions:-
The mean is: -0.0002594629418735361
The standard deviation is: 0.00608825801388009
The percentile  value is: -0.008180593837872193

Here we observe that larger sample sizes produce more accurate estimates: the mean of the sample means approaches the true mean of 0, and the standard deviation of the sample means keeps shrinking.
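This shrinkage is quantified by the central limit theorem: the standard deviation of the sample mean is sigma/sqrt(N). A quick sketch comparing the empirical spread against that prediction (the seed is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, R = 2.0, 1000

for N in (10, 1000, 100000):
    # R sample means, each from a normal sample of size N
    means = np.array([rng.normal(0.0, sigma, N).mean() for _ in range(R)])
    # Empirical spread of the sample means vs the CLT prediction sigma/sqrt(N)
    print(N, round(means.std(), 5), round(sigma / np.sqrt(N), 5))
```

The empirical standard deviations (about 0.63, 0.064, 0.0064 for N = 10, 1000, 100000) track sigma/sqrt(N) closely, matching the notebook's output above.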

2.2 Pareto Distribution

We first plot the Pareto distribution

In [122]:
data = np.random.pareto(a = 1, size=1000)
plt.hist(data, bins=np.logspace(np.log10(1),np.log10(10), 50))
plt.gca().set_xscale("log")
plt.show()
In [120]:
data = np.random.pareto(a = 1, size=10000)
plt.loglog(data)
plt.show()
In [145]:
def paretorandomsamples(a,R,N):
    # Draw R samples of size N each and record the mean of every sample.
    mean_of_samples = list()
    for i in range(0,R):
        sample = np.random.pareto(a=a, size = N)+1
        mean_of_samples.append(np.mean(sample))
    mean_of_means = np.mean(mean_of_samples)
    std_dev = np.std(mean_of_samples)
    confint = np.percentile(mean_of_samples, 10)
    print('For sample size',N,'&',R,'repetitions:-')
    print('The mean is:',mean_of_means)
    print('The standard deviation is:',std_dev)
    print('The percentile  value is:',confint)
In [155]:
paretorandomsamples(1,1000,10)
For sample size 10 & 1000 repetitions:-
The mean is: 8.974471701912503
The standard deviation is: 29.994412581059553
The percentile  value is: 2.2152137329984556
In [158]:
paretorandomsamples(1,1000,1000)
For sample size 1000 & 1000 repetitions:-
The mean is: 16.03457533809504
The standard deviation is: 52.53118013237031
The percentile  value is: 6.21058625006596
In [159]:
paretorandomsamples(1,1000,100000)
For sample size 100000 & 1000 repetitions:-
The mean is: 23.78298940163628
The standard deviation is: 185.97848677388592
The percentile  value is: 10.850735314023188

Here we see that the mean, standard deviation, and percentile value all grow with the sample size. This is because a Pareto distribution with shape a = 1 has an infinite mean: the law of large numbers does not apply, so the sample means never stabilize and are instead dominated by ever-larger outliers.
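The failure to stabilize can be seen directly in the running mean of a single long sample: because the a = 1 Pareto distribution has an infinite mean, the running mean never settles (a sketch; the seed is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
# Pareto(a=1) shifted so that the minimum value is 1, as in the cells above
x = rng.pareto(a=1.0, size=100000) + 1.0

# Running mean after the first k observations, k = 1..n.  For a
# finite-mean distribution this curve flattens out; here it keeps
# jumping upward whenever a huge outlier arrives.
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)
print(running_mean[99], running_mean[9999], running_mean[-1])
```

Plotting `running_mean` on a log x-axis makes the step-like jumps from individual outliers especially visible.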